Modeling lexical stress in continuous speech recognition for Dutch

نویسندگان

  • Henk van den Heuvel
  • David van Kuijk
  • Lou Boves
چکیده

The acoustic realization of vowels with lexical stress generally differs substantially from their unstressed counterparts, which are more reduced in spectral quality, shorter in duration, weaker in intensity and tend to have a flatter spectral tilt. Therefore, in a continuous speech recognizer (CSR) it would appear profitable to train separate models for the stressed and unstressed variants of each vowel. In the experiments reported on here, we applied stress modeling in both training and testing of the recognizer. Recognition experiments on an independent test set showed that recognition rates did not improve by this use of stress in our CSR. However, if we swapped the stress markers in the recognition lexicon the recognition rates did significantly deteriorate. This demonstrated that the acoustic models for the stressed and unstressed variants of the vowels were different. A pitfall in this experiment was that lexical stress information and phonemic context were possibly confounded. In a follow-up experiment we controlled for context by using generalized context-dependent models. In this experiment the recognition results were not improved either, although the vowel models were better tailored to capture lexical stressrelated information. We conclude that the mapping of lexical stress to the acoustic surface of fluent speech is not sufficiently straightforward to be of direct benefit for CSR, due to interaction of lexical stress with rhythm and sentence accent in real speech. 2002 Elsevier Science B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Lexical stress in continuous speech recognition

Human listeners use lexical stress for word segmentation and disambiguation. We look into using lexical stress for largevocabulary speech recognition for the Dutch language. It appears that beside vowels, consonants should be taken into account. By introducing stressed phonemes, and features for spectral bands and the fundamental frequency, we reduce the word error rate by 2.6 %.

متن کامل

Using lexical stress in continuous speech recognition for dutch

The acoustic realization of vowels with lexical stress generally differs substantially from their unstressed counterparts, which are more reduced in spectral quality, shorter in duration, weaker in intensity and tend to have a flatter spectral tilt. Therefore, in an automatic speech recognizer it would appear profitable to train separate models for the stressed and unstressed variants of each v...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Modelling Lexical Stress

Human listeners use lexical stress for word segmentation and disambiguation. We look into using lexical stress for speech recognition by examining a Dutch-language corpus. We propose that different spectral features are needed for different phonemes and that, besides vowels, consonants should be taken into account.

متن کامل

Visual lexical stress information in audiovisual spoken-word recognition

Listeners use suprasegmental auditory lexical stress information to resolve the competition words engage in during spoken-word recognition. The present study investigated whether (a) visual speech provides lexical stress information, and, more importantly, (b) whether this visual lexical stress information is used to resolve lexical competition. Dutch word pairs that differ in the lexical stres...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Speech Communication

دوره 40  شماره 

صفحات  -

تاریخ انتشار 2003